Singing Voice Separation from Monaural Music Based on Kernel Back-Fitting Using Beta-Order Spectral Amplitude Estimation
Authors
Abstract
Separating the lead singing voice from the musical background in a monaural recording is a challenging task that arises naturally in several music processing applications. Recently, kernel additive modeling with generalized spatial Wiener filtering (GW) was presented for music/voice separation. In this paper, adaptive auditory filtering based on β-order minimum mean-square error spectral amplitude estimation (bSA) is applied to kernel additive modeling to improve singing voice separation from monaural music signals. The proposed algorithm is composed of five modules: short-time Fourier transform, music/voice separation based on bSA, determination of back-fitting, back-fitting, and inverse short-time Fourier transform. In the proposed method, the spectral amplitude exponent β for each kernel component is calculated adaptively from a Singular Value Decomposition (SVD)-based factorization, so that the bSA-based auditory filtering remains effective during kernel back-fitting. Using a back-fitting threshold, the kernel back-fitting process is repeated automatically until convergence. Experimental results show that the proposed method achieves better separation performance than GW based on kernel additive modeling.
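As background for the bSA step named in the abstract, the β-order MMSE spectral amplitude estimator is commonly written in the speech-enhancement literature as a gain applied to the observed magnitude: G = (√ν / γ) · [Γ(β/2 + 1) · M(−β/2, 1; −ν)]^(1/β), with ν = ξγ / (1 + ξ), where ξ is the a priori SNR, γ is the a posteriori SNR, and M is the confluent hypergeometric function. The following is a minimal sketch of that textbook closed form under the usual complex-Gaussian assumptions. The function names `kummer_m` and `beta_order_gain` are illustrative and not from the paper, and the sketch does not reproduce the paper's SVD-based adaptive choice of β or its kernel back-fitting loop.

```python
import math


def kummer_m(a, b, z, terms=500):
    """Confluent hypergeometric function M(a, b, z) from its power series.

    Intended for moderate, non-negative z; very large z will overflow.
    """
    total = 1.0
    term = 1.0
    for n in range(terms):
        # Ratio of consecutive series terms: (a+n)/(b+n) * z/(n+1)
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
        if abs(term) < 1e-15 * abs(total):
            break
    return total


def beta_order_gain(xi, gamma, beta):
    """Gain G so that the estimated amplitude is G * |Y|.

    xi    : a priori SNR (> 0)
    gamma : a posteriori SNR (> 0)
    beta  : amplitude exponent (non-zero)
    """
    nu = xi / (1.0 + xi) * gamma
    # Kummer transformation avoids an alternating series:
    # M(-beta/2, 1, -nu) = exp(-nu) * M(1 + beta/2, 1, nu)
    m = math.exp(-nu) * kummer_m(1.0 + beta / 2.0, 1.0, nu)
    moment = math.gamma(beta / 2.0 + 1.0) * m
    return math.sqrt(nu) / gamma * moment ** (1.0 / beta)


if __name__ == "__main__":
    # Compare a few exponents at the same (hypothetical) SNR values.
    for beta in (0.5, 1.0, 2.0):
        print(beta, beta_order_gain(xi=1.0, gamma=2.0, beta=beta))
```

With β = 1 this form corresponds to the classical MMSE short-time spectral amplitude estimator, and at high SNR the gain approaches the Wiener gain ξ / (1 + ξ), which is one way to sanity-check an implementation.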
Related resources
Vocal separation from monaural music using adaptive auditory filtering based on kernel back-fitting
Recently, kernel additive modeling with generalized spatial Wiener filtering (GW) was presented for music/voice separation. In this paper, an adaptive auditory filtering method, called generalized weighted β-order MMSE estimation (WbE), is applied to the basic iterative kernel back-fitting algorithm to improve the separation of a monaural music signal into music and voice components. In the p...
Pitch Estimation of Singing Voice From Monaural Popular Music Recordings
Singing voice separation is a hard yet popular task in the field of music information retrieval (MIR). Once the voice has been successfully separated, a number of algorithms can be applied to the vocal melody for a range of applications. In this study, we applied a pitch estimation algorithm after separating a singing voice from background music based on the implementation of REPET [1]. Then we evaluated our ...
Singing Voice Separation from Monaural Recordings
Separating singing voice from music accompaniment has wide applications in areas such as automatic lyrics recognition and alignment, singer identification, and music information retrieval. Compared to the extensive studies of speech separation, singing voice separation has been little explored. We propose a system to separate singing voice from music accompaniment from monaural recordings. The ...
Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation
A spectro-temporal modulation based singing voice detector, cascaded with a Viterbi-based pitch tracking algorithm, is proposed in this paper for singing voice separation from monaural recordings. To detect the singing voice, the spectro-temporal modulation energy related to voice harmonics is extracted using a spectro-temporal modulation analysis framework developed for the Fourier spectrogram. ...
Singing Voice Separation Using Spectro-Temporal Modulation Features
An auditory-perception-inspired singing voice separation algorithm for monaural music recordings is proposed in this paper. Under the framework of computational auditory scene analysis (CASA), the music recordings are first transformed into auditory spectrograms. After extracting the spectro-temporal modulation contents of the time-frequency (T-F) units through a two-stage auditory model, we de...